AITopics | optimal model

optimal model

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Trade-off between Payoff and Model Rewards in Shapley-Fair Collaborative Machine Learning

Neural Information Processing Systems | Feb-19-2026, 11:24:33 GMT

Hence, a "fair" reward allocation scheme is desirable to give all parties enough incentives to join the collaboration.
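As a rough illustration of what a Shapley-based "fair" allocation looks like, the sketch below computes exact Shapley values by averaging each party's marginal contribution over all join orders. The party names, data sizes, and the square-root value function are hypothetical choices made for this example, not the scheme from the paper.

```python
from itertools import permutations
from math import factorial, sqrt

# Hypothetical data sizes contributed by three collaborating parties.
DATA_SIZES = {"A": 100, "B": 100, "C": 300}


def coalition_value(coalition):
    """Toy stand-in for model quality: diminishing returns in pooled data."""
    return sqrt(sum(DATA_SIZES[p] for p in coalition))


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orders."""
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    n_orders = factorial(len(players))
    return {p: total / n_orders for p, total in totals.items()}


phi = shapley_values(list(DATA_SIZES), coalition_value)
```

Two properties are worth checking on the result: the values sum to the value of the full coalition (efficiency), and parties with identical contributions (here A and B) receive identical shares (symmetry). Exact enumeration costs n! orderings, so it is only practical for a handful of parties.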

artificial intelligence, machine learning, shapley value, (16 more...)

Neural Information Processing Systems

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Using Noise to Infer Aspects of Simplicity Without Learning

Boner, Zachery, Chen, Harry

Neural Information Processing Systems | Feb-18-2026, 15:12:47 GMT

Noise in data significantly influences decision-making in the data science process. In fact, it has been shown that noise in data generation processes leads practitioners to find simpler models. However, an open question still remains: what is the degree of model simplification we can expect under different noise levels? In this work, we address this question by investigating the relationship between the amount of noise and model simplicity across various hypothesis spaces, focusing on decision trees and linear models. We formally show that noise acts as an implicit regularizer for several different noise models. Furthermore, we prove that Rashomon sets (sets of near-optimal models) constructed with noisy data tend to contain simpler models than corresponding Rashomon sets with non-noisy data. Additionally, we show that noise expands the set of "good" features and consequently enlarges the set of models that use at least one good feature. Our work offers theoretical guarantees and practical insights for practitioners and policymakers on whether simple-yet-accurate machine learning models are likely to exist, based on knowledge of noise levels in the data generation process.
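To make the term "Rashomon set" concrete, here is a minimal sketch that enumerates a tiny hypothesis space of one-dimensional threshold classifiers and keeps every model whose empirical error is within epsilon of the best one. The dataset, the threshold grid, and the function names are made up for illustration; the paper's hypothesis spaces (decision trees and linear models) are far richer.

```python
# Toy 1-D dataset (hypothetical): feature values and binary labels.
XS = [0, 1, 2, 3, 4, 5, 6, 7]
YS = [0, 0, 0, 1, 0, 1, 1, 1]


def stump_error(threshold):
    """Empirical 0-1 error of the rule 'predict 1 if x >= threshold'."""
    mistakes = sum((x >= threshold) != bool(y) for x, y in zip(XS, YS))
    return mistakes / len(XS)


def rashomon_set(candidates, loss, epsilon):
    """All candidates whose loss is within epsilon of the minimum loss."""
    losses = {c: loss(c) for c in candidates}
    best = min(losses.values())
    return {c for c, value in losses.items() if value <= best + epsilon}


thresholds = range(0, 9)
exact_optima = rashomon_set(thresholds, stump_error, epsilon=0.0)
near_optima = rashomon_set(thresholds, stump_error, epsilon=0.125)
```

Widening epsilon from 0 to 0.125 grows the set from the two best thresholds to five, which is the sense in which a Rashomon set captures many near-optimal models rather than a single winner.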

artificial intelligence, machine learning, noise, (17 more...)

Neural Information Processing Systems

Country:

Europe > Netherlands > North Holland > Amsterdam (0.04)
North America > United States > Wisconsin (0.04)
North America > United States > Florida > Broward County (0.04)
North America > Dominican Republic (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry:

Government (1.00)
Health & Medicine (0.93)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Scalable branch-and-bound model selection with non-monotonic criteria including AIC, BIC and Mallows's $\mathit{C_p}$

Vanhoefer, Jakob, Körner, Antonia, Doresic, Domagoj, Hasenauer, Jan, Pathirana, Dilan

arXiv.org Machine Learning | Dec-16-2025

Model selection is a pivotal process in the quantitative sciences, where researchers must navigate between numerous candidate models of varying complexity. Traditional information criteria, such as the corrected Akaike Information Criterion (AICc), Bayesian Information Criterion (BIC), and Mallows's $\mathit{C_p}$, are valuable tools for identifying optimal models. However, the exponential increase in candidate models with each additional model parameter renders the evaluation of these criteria for all models (a strategy known as exhaustive, or brute-force, search) computationally prohibitive. Consequently, heuristic approaches like stepwise regression are commonly employed, albeit without guarantees of finding the globally optimal model. In this study, we challenge the prevailing notion that non-monotonicity in information criteria precludes bounds on the search space. We introduce a simple but novel bound that enables the development of branch-and-bound algorithms tailored for these non-monotonic functions. We demonstrate that our approach guarantees identification of the optimal model(s) across diverse model classes, sizes, and applications, often with orders-of-magnitude computational speedups. For instance, in one previously published model selection task involving $2^{32}$ (approximately 4 billion) candidate models, our method achieves a computational speedup exceeding 6,000-fold. These findings have broad implications for the scalability and effectiveness of model selection in complex scientific domains.
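The general idea of bounding a non-monotonic criterion can be sketched as follows. The code assumes only that the fit term (minus twice the log-likelihood) never gets worse when parameters are added to a nested model. Under that assumption, any model in a subtree scores at least the fit term of the largest model in the subtree plus the penalty of the smallest one. This is one plausible bound written for illustration, not necessarily the bound introduced in the paper, and `toy_neg2loglik` and `GAINS` are hypothetical stand-ins for a real model fit.

```python
from itertools import combinations

# Hypothetical per-parameter improvements in -2 * log-likelihood.
GAINS = [10.0, 0.5, 3.0, 1.0, 6.0, 0.1]


def toy_neg2loglik(subset):
    """Toy fit term: monotonically non-increasing as parameters are added."""
    return 100.0 - sum(GAINS[i] for i in subset)


def aic(neg2loglik_value, k):
    """AIC = 2k - 2 ln L."""
    return neg2loglik_value + 2 * k


def exhaustive_search(n, neg2loglik):
    """Score all 2**n subsets; returns (best AIC, best subset)."""
    best = (float("inf"), frozenset())
    for r in range(n + 1):
        for combo in combinations(range(n), r):
            subset = frozenset(combo)
            score = aic(neg2loglik(subset), len(subset))
            if score < best[0]:
                best = (score, subset)
    return best


def branch_and_bound(n, neg2loglik):
    """Returns (best AIC, best subset, number of leaf models scored)."""
    best = [float("inf"), frozenset()]
    leaves_scored = [0]

    def visit(included, next_param):
        undecided = frozenset(range(next_param, n))
        # Lower bound for the subtree: best possible fit, smallest penalty.
        bound = neg2loglik(included | undecided) + 2 * len(included)
        if bound >= best[0]:
            return  # no model below this node can beat the incumbent
        if next_param == n:
            leaves_scored[0] += 1
            best[0], best[1] = bound, included  # at a leaf, bound == AIC
            return
        visit(included | {next_param}, next_param + 1)
        visit(included, next_param + 1)

    visit(frozenset(), 0)
    return best[0], best[1], leaves_scored[0]


bb_score, bb_subset, bb_leaves = branch_and_bound(len(GAINS), toy_neg2loglik)
ex_score, ex_subset = exhaustive_search(len(GAINS), toy_neg2loglik)
```

On this toy problem both searches select the same subset, while the bounded search scores fewer than the 64 leaf models that exhaustive search must visit. Because pruning uses `>=`, the sketch returns one optimal model rather than all tied optima.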

backward selection, optimal model, selection, (16 more...)

arXiv.org Machine Learning

2512.12221

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Germany > North Rhine-Westphalia > Cologne Region > Bonn (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Equitable Survival Prediction: A Fairness-Aware Survival Modeling (FASM) Approach

Liu, Mingxuan, Ning, Yilin, Wang, Haoyuan, Hong, Chuan, Engelhard, Matthew, Bitterman, Danielle S., La Cava, William G., Liu, Nan

arXiv.org Artificial Intelligence | Oct-24-2025

As machine learning models become increasingly integrated into healthcare, structural inequities and social biases embedded in clinical data can be perpetuated or even amplified by data-driven models. In survival analysis, censoring and time dynamics can further add complexity to fair model development. Additionally, algorithmic fairness approaches often overlook disparities in cross-group rankings, e.g., high-risk Black patients may be ranked below lower-risk White patients who do not experience the event of mortality. Such misranking can reinforce biological essentialism and undermine equitable care. We propose Fairness-Aware Survival Modeling (FASM), an approach designed to mitigate algorithmic bias regarding both intra-group and cross-group risk rankings over time. Using breast cancer prognosis as a representative case and applying FASM to SEER breast cancer data, we show that FASM substantially improves fairness while preserving discrimination performance comparable to fairness-unaware survival models. Time-stratified evaluations show that FASM maintains stable fairness over a 10-year horizon, with the greatest improvements observed during the mid-term of follow-up. Our approach enables the development of survival models that prioritize both accuracy and equity in clinical decision-making, advancing fairness as a core principle in clinical care.
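One way to see what "intra-group" versus "cross-group" ranking means is to restrict a concordance index to pairs of patients from the same group or from different groups. The sketch below uses the standard Harrell-style pair rule (a pair is comparable when the patient with the shorter time had an observed event). The group-restricted variants and the toy data are assumptions made for this example and may differ from the metrics used in the paper.

```python
def concordance(times, events, risks, groups=None, mode="all"):
    """Fraction of comparable pairs ranked correctly by predicted risk.

    mode="all" uses every comparable pair, "intra" only same-group pairs,
    "cross" only pairs whose members belong to different groups.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] >= times[j]:
                continue  # need i to fail strictly before j's time
            same_group = groups is not None and groups[i] == groups[j]
            if mode == "cross" and same_group:
                continue
            if mode == "intra" and not same_group:
                continue
            comparable += 1
            if risks[i] > risks[j]:
                concordant += 1.0
            elif risks[i] == risks[j]:
                concordant += 0.5  # ties count as half
    return concordant / comparable if comparable else float("nan")


# Hypothetical toy cohort: follow-up times, event flags, risk scores, groups.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risks = [0.9, 0.3, 0.5, 0.1]
groups = ["a", "b", "a", "b"]

c_all = concordance(times, events, risks)
c_intra = concordance(times, events, risks, groups, mode="intra")
c_cross = concordance(times, events, risks, groups, mode="cross")
```

In this toy cohort the within-group ranking is perfect while the cross-group ranking is not, so an evaluation that only looks inside each group would miss the cross-group misranking.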

artificial intelligence, fairness, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2510.20629

Country: North America > United States (0.93)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

c50c42f853db0f1f5b4195358b6d97de-Supplemental-Conference.pdf

Neural Information Processing Systems | Oct-11-2025, 00:03:52 GMT

artificial intelligence, budget constraint, machine learning, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

Using Noise to Infer Aspects of Simplicity Without Learning

Boner, Zachery, Chen, Harry

Neural Information Processing Systems | Oct-10-2025, 20:50:16 GMT

dataset, noise, rashomon, (15 more...)

Neural Information Processing Systems

Country:

Europe > Netherlands > North Holland > Amsterdam (0.04)
North America > United States > Wisconsin (0.04)
North America > United States > Florida > Broward County (0.04)
North America > Dominican Republic (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry:

Government (1.00)
Health & Medicine (0.93)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

To all reviewers, thank you very much for your thoughtful comments and suggestions

Neural Information Processing Systems | Oct-3-2025, 09:16:35 GMT

To all reviewers, thank you very much for your thoughtful comments and suggestions. R#1: "...importance of similarity among the selected tasks..." R#1: "...domain randomization, when enough samples are used, is a better alternative to meta-learning..." R#2: "...Theorems 1 and 2 are asymptotic..." Hence, the theorems are NOT asymptotic. We will remove the asymptotic parts for clarity. R#2: 'Assumption 2 ... the per-task optimal models are centered around the corresponding optimal solutions. This assumption can easily be dropped at the cost of including the distance as a term.

maml, thoughtful comment and suggestion, trade-off, (14 more...)

Neural Information Processing Systems

Genre: Research Report (0.30)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.95)

Mix and Match: An Optimistic Tree-Search Approach for Learning Models from Mixture Distributions

Neural Information Processing Systems | Oct-3-2025, 08:27:23 GMT

In this paper, we study the problem of correcting for distribution shift using mixture search.

algorithm, dataset, mixture distribution, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Ohio (0.04)
North America > United States > Connecticut (0.04)
North America > Canada (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

0c72cb7ee1512f800abe27823a792d03-Supplemental.pdf

Neural Information Processing Systems | Oct-2-2025, 00:45:57 GMT

accumulative metric, artificial intelligence, machine learning, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.97)

Scaling Laws of Motion Forecasting and Planning -- Technical Report

Baniodeh, Mustafa, Goel, Kratarth, Ettinger, Scott, Fuertes, Carlos, Seff, Ari, Shen, Tim, Gulino, Cole, Yang, Chenjie, Jerfel, Ghassen, Choe, Dokook, Wang, Rui, Charrow, Benjamin, Kallem, Vinutha, Casas, Sergio, Al-Rfou, Rami, Sapp, Benjamin, Anguelov, Dragomir

arXiv.org Artificial Intelligence | Sep-9-2025

We study the empirical scaling laws of a family of encoder-decoder autoregressive transformer models on the task of joint motion forecasting and planning in the autonomous driving domain. Using a 500-thousand-hour driving dataset, we demonstrate that, similar to language modeling, model performance improves as a power-law function of the total compute budget, and we observe a strong correlation between model training loss and model evaluation metrics. Most interestingly, closed-loop metrics also improve with scaling, which has important implications for the suitability of open-loop metrics for model development and hill climbing. We also study the optimal scaling of the number of transformer parameters and the training data size for a training compute-optimal model. We find that as the training compute budget grows, optimal scaling requires increasing the model size 1.5x as fast as the dataset size. We also study inference-time compute scaling, where we observe that sampling and clustering the output of smaller models makes them competitive with larger models, up to a crossover point beyond which a larger model becomes more inference-compute efficient. Overall, our experimental results demonstrate that optimizing the training and inference-time scaling properties of motion forecasting and planning models is a key lever for improving their performance to address a wide variety of driving scenarios. Finally, we briefly study the utility of training on general logged driving data of other agents to improve the performance of the ego-agent, an important research area to address the scarcity of robotics data for training large-capacity models.
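The "1.5x as fast" statement can be turned into a small worked calculation under two assumptions that are not stated in the abstract: that training compute scales as the product of model size N and dataset size D (as is commonly assumed in language-model scaling work), and that "1.5x as fast" refers to the ratio of power-law exponents. With N proportional to C^a and D proportional to C^b, those assumptions give a + b = 1 and a = 1.5 b.

```python
def compute_optimal_exponents(size_to_data_ratio):
    """Solve a + b = 1 and a = ratio * b for N ~ C**a, D ~ C**b.

    Assumes compute C is proportional to N * D (an assumption of this sketch).
    """
    b = 1.0 / (1.0 + size_to_data_ratio)
    a = size_to_data_ratio * b
    return a, b


a, b = compute_optimal_exponents(1.5)

# How model and dataset size would grow for a 10x larger compute budget.
compute_growth = 10.0
model_growth = compute_growth ** a
data_growth = compute_growth ** b
```

Under these assumptions the exponents come out to roughly a = 0.6 and b = 0.4, so a tenfold compute increase would be split into about a fourfold larger model and about 2.5 times more data. A different reading of "1.5x as fast" would give different numbers.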

compute, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2506.08228

Genre: Research Report > New Finding (0.48)

Industry: